Life arose on Earth sometime in the first few hundred million years 
after the young planet had cooled to the point that it could support 
water-based organisms on its surface. The early emergence of life 
on Earth has been taken as evidence that the probability of abiogen- 
esis is high, if starting from young-Earth-like conditions. We revisit 
this argument quantitatively in a Bayesian statistical framework. By 
constructing a simple model of the probability of abiogenesis, we 
calculate a Bayesian estimate of its posterior probability, given the 
data that life emerged fairly early in Earth's history and that, billions 
of years later, curious creatures noted this fact and considered its 
implications. We find that, given only this very limited empirical 
information, the choice of Bayesian prior for the abiogenesis proba- 
bility parameter has a dominant influence on the computed posterior 
probability. Although terrestrial life's early emergence provides evi- 
dence that life might be common in the Universe if early-Earth-like 
conditions are, the evidence is inconclusive and indeed is consistent 
with an arbitrarily low intrinsic probability of abiogenesis for plausible 
uninformative priors. Finding a single case of life arising indepen- 
dently of our lineage (on Earth, elsewhere in the Solar System, or 
on an extrasolar planet) would provide much stronger evidence that 
abiogenesis is not extremely rare in the Universe. 

Astrobiology 

Abbreviations: Gyr, gigayear (10^9 years); PDF, probability density function; CDF, 
cumulative distribution function 



Introduction 

Astrobiology is fundamentally concerned with whether extraterrestrial
life exists and, if so, how abundant it is in the Universe. The most
direct and promising approach to answering these questions is surely
empirical: the search for life on other bodies in the Solar System
[1] [2] and beyond in other planetary systems [3] [4]. Nevertheless, a
theoretical approach is possible in principle and could provide a useful
complement to the more direct lines of investigation.

In particular, if we knew the probability per unit time 
and per unit volume of abiogenesis in a pre-biotic environ- 
ment as a function of its physical and chemical conditions 
and if we could determine or estimate the prevalence of such 
environments in the Universe, we could make a statistical esti- 
mate of the abundance of extraterrestrial life. This relatively 
straightforward approach is, of course, thwarted by our great 
ignorance regarding both inputs to the argument at present. 

There does, however, appear to be one possible way of fi- 
nessing our lack of detailed knowledge concerning both the 
process of abiogenesis and the occurrence of suitable pre- 
biotic environments (whatever they might be) in the Universe. 
Namely, we can try to use our knowledge that life arose at least 
once in an environment (whatever it was) on the early Earth 
to try to infer something about the probability per unit time 
of abiogenesis on an Earth-like planet without the need (or 
ability) to say how Earth-like it need be or in what ways. We 
will hereinafter refer to this probability per unit time, which 
can also be considered a rate, as A or simply the "probability 
of abiogenesis." 



Any inferences about the probability of life arising (given 
the conditions present on the early Earth) must be informed 
by how long it took for the first living creatures to evolve. By 
definition, improbable events generally happen infrequently. 
It follows that the duration between events provides a metric 
(however imperfect) of the probability or rate of the events. 
The time-span between when Earth achieved pre-biotic conditions
suitable for abiogenesis plus generally habitable climatic
conditions [5] [6] [7] and when life first arose, therefore, seems
to serve as a basis for estimating A. Revisiting and quantifying
this analysis is the subject of this paper.

We note several previous quantitative attempts to address this issue
in the literature, of which one [8] found, as we do, that early
abiogenesis is consistent with life being rare, and the other [9] found
that Earth's early abiogenesis points strongly to life being common on
Earth-like planets (we compare our approach to the problem to that of
[9] below, including our significantly different results). Furthermore,
an argument of this general sort has been widely used in a qualitative
and even intuitive way to conclude that A is unlikely to be extremely
small because it would then be surprising for abiogenesis to have
occurred as quickly as it did on Earth [12] [13] [14] [15] [16] [17]
[18]. Indeed, the early emergence of life on Earth is often taken as
significant supporting evidence for "optimism" about the existence of
extra-terrestrial life (i.e., for the view that it is fairly common)
[19] [20] [9]. The major motivation of this paper is to determine the
quantitative validity of this inference. We emphasize that our goal is
not to derive an optimum estimate of A based on all of the many lines
of available evidence, but simply to evaluate the implication of life's
early emergence on Earth for the value of A.



A Bayesian Formulation of the Calculation 

Bayes's theorem [21] can be written as P[M|D] =
(P[D|M] × P_prior[M]) / P[D]. Here, we take M to be a model
and D to be data. In order to use this equation to evaluate
the posterior probability of abiogenesis, we must specify
appropriate M and D.

There are two unpublished works, [10] and [11], of which we became aware after submission
of this paper, that also conclude that early life on Earth does not rule out the possibility that
abiogenesis is improbable.

A Poisson or Uniform Rate Model. In considering the development
of life on a planet, we suggest that a reasonable, if simplistic,
model is that it is a Poisson process during a period of time from
t_min until t_max. In this model, the conditions on a young planet
preclude the development of life for a time period of t_min after its
formation. Furthermore, if the planet remains lifeless until t_max has
elapsed, it will remain lifeless thereafter as well because conditions
no longer permit life to arise. For a planet around a solar-type star,
t_max is almost certainly < 10 Gyr (10 billion years, the main sequence
lifetime of the Sun) and could easily be a substantially shorter period
of time if there is something about the conditions on a young planet
that is necessary for abiogenesis. Between these limiting times, we
posit that there is a certain probability per unit time (A) of life
developing. For t_min < t < t_max, then, the probability of life
arising n times in time t is



\[ P[A, n, t] = P_{\rm Poisson}[A, n, t] = e^{-A(t - t_{\rm min})} \, \frac{\{A(t - t_{\rm min})\}^{n}}{n!} \qquad [1] \]

where t is the time since the formation of the planet. 
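
As a rough illustration of equation (1), the following sketch evaluates the Poisson probability and the chance of at least one abiogenesis event (one minus the n = 0 term). The function names, the variable `lam` (standing in for the rate A), and the default t_min = 0.5 Gyr are assumptions made for this example, not values fixed by the text at this point.

```python
import math

def poisson_prob(lam, n, t, t_min=0.5):
    """Equation (1): probability of exactly n abiogenesis events by time t.

    lam:   rate per Gyr (the quantity the text calls A)
    t:     time since planet formation, in Gyr (assumed to be below t_max)
    t_min: time before which life cannot arise (0.5 Gyr is an assumed value)
    """
    if t <= t_min:
        # Before t_min the model allows no events at all.
        return 1.0 if n == 0 else 0.0
    mu = lam * (t - t_min)  # expected number of events so far
    return math.exp(-mu) * mu**n / math.factorial(n)

def prob_life_arises(lam, t, t_min=0.5):
    """Probability of at least one event: one minus the n = 0 term."""
    return 1.0 - poisson_prob(lam, 0, t, t_min)

if __name__ == "__main__":
    for lam in (0.01, 1.0, 100.0):
        print(lam, prob_life_arises(lam, t=0.7))
```

The loop at the end simply shows how strongly the chance of life by t = 0.7 Gyr depends on the assumed rate.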

This formulation could well be questioned on a number of 
grounds. Perhaps most fundamentally, it treats abiogenesis 
as though it were a single instantaneous event and implicitly 
assumes that it can occur in only a single way (i.e., by only a 
single process or mechanism) and only in one type of physical 
environment. It is, of course, far more plausible that abiogen- 
esis is actually the result of a complex chain of events that 
take place over some substantial period of time and perhaps 
via different pathways and in different environments. How- 
ever, knowledge of the actual origin of life on Earth, to say 
nothing of other possible ways in which it might originate, is 
so limited that a more complex model is not yet justified. In 
essence, the simple Poisson event model used in this paper 
attempts to "integrate out" all such details and treat abio- 
genesis as a "black box" process: certain chemical and phys- 
ical conditions as input produce a certain probability of life 
emerging as an output. Another issue is that A, the probability
per unit time, could itself be a function of time. In fact,
the claim that life could not have arisen outside the window
(t_min, t_max) is tantamount to saying that A = 0 for t < t_min
and for t > t_max. Instead of switching from 0 to a fixed value
instantaneously, A could exhibit a complicated variation with
time. If so, however, P[A, n, t] is not represented by the Poisson
distribution and eq. (1) is not valid. Unless a particular
(non top-hat-function) time-variation of A is suggested on
theoretical grounds, it seems unwise to add such unconstrained
complexity.

A further criticism is that A could be a function of n: it
could be that life arising once (or more) changes the probability
per unit time of life arising again. Since we are primarily
interested in the probability of life arising at all - i.e., the
probability of n ≠ 0 - we can define A simply to be the value
appropriate for a prebiotic planet (whatever that value may
be) and remain agnostic as to whether it differs for n ≥ 1.
Thus, within the adopted model, the probability of life arising
is one minus the probability of it not arising:

\[ P_{\rm life} = 1 - P_{\rm Poisson}[A, 0, t] = 1 - e^{-A(t - t_{\rm min})} \qquad [2] \]



A Minimum Evolutionary Time Constraint. Naively, the single
datum informing our calculation of the posterior of A appears
to be simply that life arose on Earth at least once, approximately
3.8 billion years ago (give or take a few hundred million
years). There is additional significant context for this datum,
however. Recall that the standard claim is that, since life
arose early on the only habitable planet that we have examined
for inhabitants, the probability of abiogenesis is probably
high (in our language, A is probably large). This standard
argument neglects a potentially important selection effect,
namely: On Earth, it took nearly 4 Gyr for evolution to
lead to organisms capable of pondering the probability of life
elsewhere in the Universe. If this is a necessary duration, then
it would be impossible for us to find ourselves on, for example,
a (~4.5-Gyr old) planet on which life first arose only after the
passage of 3.5 billion years [22]. On such planets there would
not yet have been enough time for creatures capable of such
contemplations to evolve. In other words, if evolution requires
3.5 Gyr for life to evolve from the simplest forms to intelligent,
questioning beings, then we had to find ourselves on a planet
where life arose relatively early, regardless of the value of A.

Table 1. All times are in Gyr. Two "Conservative" (Conserv.) models are
shown, to indicate that t_required may be limited either by a small
value of t_max ("Conserv. 1"), or by a large value of δt_evolve
("Conserv. 2").

In order to introduce this constraint into the calculation
we define δt_evolve as the minimum amount of time required after
the emergence of life for cosmologically curious creatures
to evolve, t_emerge as the age of the Earth from when the earliest
extant evidence of life remains (though life might have actually
emerged earlier), and t_0 as the current age of the Earth.
The data, then, are that life arose on Earth at least once,
approximately 3.8 billion years ago, and that this emergence was
early enough that human beings had the opportunity subsequently
to evolve and to wonder about their origins and the
possibility of life elsewhere in the Universe. In equation form,

\[ t_{\rm emerge} < t_0 - \delta t_{\rm evolve} . \]



The Likelihood Term. We now seek to evaluate the P[D|M]
term in Bayes's theorem. Let t_required = min[t_0 - δt_evolve, t_max].
Our existence on Earth requires that life appeared within t_required.
In other words, t_required is the maximum age that the Earth could
have had at the origin of life in order for humanity to have a chance
of showing up by the present. We define S_E to be the set of all
Earth-like worlds of age approximately t_0 in a large, unbiased volume
and L[t] to be the subset of S_E on which life has emerged within a
time t. L[t_required] is the set of planets on which life emerged early
enough that creatures curious about abiogenesis could have evolved
before the present (t_0), and, presuming t_emerge < t_required (which
we know was the case for Earth), L[t_emerge] is the subset of
L[t_required] on which life emerged as quickly as it did on Earth.
Correspondingly, N_{S_E}, N_{t_r}, and N_{t_e} are the respective
numbers of planets in sets S_E, L[t_required], and L[t_emerge]. The
fractions φ_{t_r} = N_{t_r}/N_{S_E} and φ_{t_e} = N_{t_e}/N_{S_E} are,
respectively, the fraction of Earth-like planets on which life arose
within t_required and the fraction on which life emerged within
t_emerge. The ratio r = φ_{t_e}/φ_{t_r} = N_{t_e}/N_{t_r} is the
fraction of L[t_required] on which life arose as soon as it did on
Earth. Given that we had to find ourselves on such a planet in the set
L[t_required] in order to write and read about this topic, the ratio r
characterizes the probability of the data given the model if the
probability of intelligent observers arising is independent of the
time of abiogenesis (so long as abiogenesis occurs before t_required).
(This last assumption might seem strange or unwarranted, but the effect
of relaxing this assumption is to make it more likely that we would
find ourselves on a planet with early abiogenesis and therefore to
reduce our limited ability to infer anything about A from our
observations.) Since φ_{t_e} = 1 - P_Poisson[A, 0, t_emerge] and
φ_{t_r} = 1 - P_Poisson[A, 0, t_required], we may write that

\[ P[D|M] = \frac{1 - \exp[-A(t_{\rm emerge} - t_{\rm min})]}{1 - \exp[-A(t_{\rm required} - t_{\rm min})]} \qquad [3] \]

if t_min < t_emerge < t_required (and P[D|M] = 0 otherwise). This
is called the "likelihood function," and represents the probability
of the observation(s), given a particular model. It is via this
function that the data "condition" our prior beliefs about A in
standard Bayesian terminology.

An alternative way to derive equation (3) is to let E = "abiogenesis
occurred between t_min and t_emerge" and R = "abiogenesis occurred
between t_min and t_required." We then have, from
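
To make the selection argument behind equation (3) concrete, here is a minimal sketch that evaluates the likelihood and compares it with a brute-force simulation of planets on which observers could have evolved. The specific times (t_min = 0.5, t_emerge = 0.7, t_required = 3.5 Gyr) are illustrative assumptions, and `lam` again stands in for A; the helper names are hypothetical.

```python
import math
import random

def likelihood(lam, t_emerge, t_required, t_min):
    """Equation (3): P[D|M] for a positive rate lam (the text's A), times in Gyr."""
    if not (t_min < t_emerge < t_required):
        return 0.0
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    num = -math.expm1(-lam * (t_emerge - t_min))
    den = -math.expm1(-lam * (t_required - t_min))
    return num / den

def simulated_fraction(lam, t_emerge, t_required, t_min,
                       n_planets=200_000, seed=0):
    """Among simulated planets where life arose before t_required,
    return the fraction on which it arose before t_emerge."""
    rng = random.Random(seed)
    observers = early = 0
    for _ in range(n_planets):
        # Waiting time to the first event of a Poisson process is exponential.
        t_abio = t_min + rng.expovariate(lam)
        if t_abio < t_required:
            observers += 1
            if t_abio < t_emerge:
                early += 1
    return early / observers

if __name__ == "__main__":
    args = dict(t_emerge=0.7, t_required=3.5, t_min=0.5)  # assumed values
    for lam in (0.1, 1.0, 10.0):
        print(lam, likelihood(lam, **args), simulated_fraction(lam, **args))
```

The simulated fraction should track the analytic value to within sampling noise, which is the sense in which the ratio r is "the probability of the data given the model."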

Limiting Behavior of the Likelihood. It is instructive to consider
the behavior of equation (3) in some interesting limits.
For A(t_required - t_min) ≪ 1, the numerator and denominator
of equation (3) each go approximately as the argument of the
exponential function; therefore, in this limit, the likelihood
function is approximately constant:

\[ P[D|M] \approx \frac{t_{\rm emerge} - t_{\rm min}}{t_{\rm required} - t_{\rm min}} . \qquad [4] \]



This result is intuitively easy to understand as follows: If A
is sufficiently small, it is overwhelmingly likely that abiogenesis
occurred only once in the history of the Earth, and by the
assumptions of our model, the one event is equally likely to occur
at any time during the interval between t_min and t_required.
The chance that this will occur by t_emerge is then just the
fraction of that total interval that has passed by t_emerge - the
result given in equation (4).
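
The small-A limit can also be checked with a short expansion. The block below is a sketch of that step, writing Δt_1 = t_emerge - t_min and Δt_2 = t_required - t_min as shorthand and keeping the first correction term to show how the constant value is approached.

```latex
\begin{align}
1 - e^{-x} &= x - \tfrac{x^{2}}{2} + O(x^{3}), \\
P[D|M] &= \frac{1 - e^{-A\,\Delta t_1}}{1 - e^{-A\,\Delta t_2}}
        \approx \frac{A\,\Delta t_1\,\bigl(1 - \tfrac{1}{2}A\,\Delta t_1\bigr)}
                     {A\,\Delta t_2\,\bigl(1 - \tfrac{1}{2}A\,\Delta t_2\bigr)}
        \approx \frac{\Delta t_1}{\Delta t_2}
          \Bigl[\,1 + \tfrac{1}{2}A\,(\Delta t_2 - \Delta t_1)\Bigr]
        \quad \text{for } A\,\Delta t_2 \ll 1 .
\end{align}
```

The factor of A cancels at leading order, so the likelihood carries no information about A in this regime; the correction term is small whenever A Δt_2 is small.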

In the other limit, when A(t_emerge - t_min) ≫ 1, the numerator
and denominator of equation (3) are both approximately
1. In this case, the likelihood function is also approximately
constant (and equal to unity). This result is even more intuitively
obvious since a very large value of A implies that
abiogenesis events occur at a high rate (given suitable conditions)
and are thus likely to have occurred very early in the
interval between t_min and t_required.

These two limiting cases, then, already reveal a key conclusion
of our analysis: the posterior distribution of A for
both very large and very small values will have the shape of
the prior, just scaled by different constants. Only when A is
neither very large nor very small - or, more precisely, when
A(t_emerge - t_min) ≈ 1 - do the data and the prior both inform
the posterior probability at a roughly equal level.
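
The two plateaus and the transition between them can be seen numerically. This is a small sketch, assuming the time differences 0.2 Gyr and 3.0 Gyr (the values the text quotes for its Optimistic model); the function name and the sampled rates are illustrative choices.

```python
import math

def likelihood_from_intervals(lam, dt1, dt2):
    """Equation (3) written in terms of dt1 = t_emerge - t_min and
    dt2 = t_required - t_min; lam is the rate the text calls A (lam > 0)."""
    # Both expm1 terms are negative, so their ratio is positive.
    return math.expm1(-lam * dt1) / math.expm1(-lam * dt2)

if __name__ == "__main__":
    dt1, dt2 = 0.2, 3.0  # Gyr
    print("small-rate plateau dt1/dt2 =", dt1 / dt2)
    for lam in (1e-6, 1e-3, 1.0, 5.0, 1e2, 1e4):
        print(f"{lam:8.0e}  {likelihood_from_intervals(lam, dt1, dt2):.4f}")
```

Rates far below 1/dt2 sit on the lower plateau, rates far above 1/dt1 sit at unity, and only the range in between is shaped by the data.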

The Bayes Factor. In this context, note that the probability
in equation (3) depends crucially on two time differences,
Δt_1 = t_emerge - t_min and Δt_2 = t_required - t_min, and that the
ratio of the likelihood function at large A to its value at small
A goes roughly as

\[ R \equiv \frac{P[{\rm data}\,|\,{\rm large}\ A]}{P[{\rm data}\,|\,{\rm small}\ A]} \approx \frac{\Delta t_2}{\Delta t_1} . \qquad [5] \]

Fig. 1. PDF and CDF of A for uniform, logarithmic, and inverse-uniform
priors, for model Optimistic, with A_min = 10^-3 Gyr^-1
and A_max = 10^3 Gyr^-1. Top: The dashed and solid curves represent,
respectively, the prior and posterior probability distribution
functions (PDFs) of A under three different assumptions about the
nature of the prior. The green curves are for a prior that is uniform
on the range 0 Gyr^-1 ≤ A ≤ A_max ("Uniform"); the blue are for a
prior that is uniform in the log of A on the range -3 ≤ log A ≤ 3
("Log (-3)"); and the red are for a prior that is uniform in A^-1 on
the interval 10^-3 Gyr ≤ A^-1 ≤ 10^3 Gyr ("InvUnif (-3)"). Bottom:
The curves represent the cumulative distribution functions (CDFs)
of A. The ordinate on each curve represents the integrated probability
from 0 to the abscissa (color and line-style schemes are the same
as in the top panel). For a uniform prior, the posterior CDF traces
the prior almost exactly. In this case, the posterior judgment that
A is probably large simply reflects the prior judgment of the
distribution of A. For the prior that is uniform in A^-1 (InvUnif), the
posterior judgment is quite opposite - namely, that A is probably
quite small - but this judgment is also foretold by the prior, which
is traced nearly exactly by the posterior. For the logarithmic prior,
the datum (that life on Earth arose within a certain time window)
does influence the posterior assessment of A, shifting it in the
direction of making greater values of A more probable. Nevertheless,
the posterior probability is ~12% that A < 1 Gyr^-1. Lower A_min
and/or lower A_max would further increase the posterior probability
of very low A, for any of the priors.



R is called the Bayes factor or Bayes ratio and is sometimes
employed for model selection purposes. In one conventional
interpretation [23], R < 10 implies no strong reason in the
data alone to prefer the model in the numerator over the one
in the denominator. For the problem at hand, this means that
the datum does not justify preference for a large value of A
over an arbitrarily small one unless equation (5) gives a result
larger than roughly ten.

the rules of conditional probability, P[E|R, M] = P[E, R|M]/P[R|M]. Since E entails R,
the numerator on the right-hand side is simply equal to P[E|M], which means that the previous
equation reduces to equation (3).

[24] advances this claim based on theoretical arguments that are critically reevaluated in [25].
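
A quick numerical reading of the Bayes factor may help here. The sketch below uses the Optimistic time differences quoted in the text (Δt_1 = 0.2 Gyr, Δt_2 = 3.0 Gyr); the Hypothetical entry assumes Δt_1 = 0.01 Gyr with the same Δt_2, which is an assumption chosen to be consistent with the text's upper value of about 300. The function name and threshold constant are illustrative.

```python
def bayes_factor(dt1, dt2):
    """Large-A over small-A likelihood ratio in the limiting form dt2/dt1.

    dt1: t_emerge - t_min, in Gyr
    dt2: t_required - t_min, in Gyr
    """
    return dt2 / dt1

STRONG_EVIDENCE = 10.0  # threshold used in the conventional interpretation

models = {
    "Optimistic": (0.2, 3.0),     # values quoted in the text
    "Hypothetical": (0.01, 3.0),  # assumed values, see lead-in
}

if __name__ == "__main__":
    for name, (dt1, dt2) in models.items():
        r = bayes_factor(dt1, dt2)
        print(name, round(r, 3), "strong" if r > STRONG_EVIDENCE else "not strong")
```

Because the ratio depends only on the two time windows, shrinking the window in which observers can evolve (smaller Δt_2) weakens the evidence regardless of the prior.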

Since the likelihood function contains all of the information
in the data and since the Bayes factor has the limiting
behavior given in equation (5), our analysis in principle need
not consider priors. If a small value of A is to be decisively
ruled out by the data, the value of R must be much larger
than unity. It is not for plausible choices of the parameters
(see Table 1), and thus arbitrarily small values of A can only
be excluded by some adopted prior on its values. Still, for
illustrative purposes, we now proceed to demonstrate the
influence of various possible A priors on the A posterior.



The Prior Term. To compute the desired posterior probability,
what remains to be specified is P_prior[M], the prior joint
probability density function (PDF) of A, t_min, t_max, and δt_evolve.
One approach to choosing appropriate priors for t_min, t_max,
and δt_evolve would be to try to distill geophysical and
paleobiological evidence along with theories for the evolution of
intelligence and the origin of life into quantitative distribution
functions that accurately represent prior information and beliefs
about these parameters. Then, in order to ultimately
calculate a posterior distribution of A, one would marginalize
over these "nuisance parameters." However, since our goal
is to evaluate the influence of life's early emergence on our
posterior judgment of A (and not of the other parameters),
we instead adopt a different approach. Rather than calculating
a posterior over this 4-dimensional parameter space, we
investigate the way these three time parameters affect our
inferences regarding A by simply taking their priors to be delta
functions at several theoretically interesting values: a purely
hypothetical situation in which life arose extremely quickly,
a most conservative situation, and an in-between case that is
also optimistic but for which there does exist some evidence
(see Table 1).

For the values in Table 1, the likelihood ratio R varies
from ~1.1 to 300, with the parameters of the "optimistic"
model giving a borderline significance value of R = 15. Thus,
only the hypothetical case gives a decisive preference for large
A by the Bayes factor metric, and we emphasize that there
is no direct evidence that abiogenesis on Earth occurred that
early, only 10 million years after conditions first permitted it.

We also lack a first-principles theory or other solid prior
information for A. We therefore take three different functional
forms for the prior - uniform in A, uniform in A^-1 (equivalent
to saying that the mean time until life appears is uniformly
distributed), and uniform in log_10 A. For the uniform in A
prior, we take our prior confidence in A to be uniformly
distributed on the interval 0 to A_max = 1000 Gyr^-1 (and to
be 0 otherwise). For the uniform in A^-1 and the uniform in
log_10[A] priors, we take the prior density functions for A^-1
and log_10[A], respectively, to be uniform on A_min ≤ A ≤ A_max
(and 0 otherwise). For illustrative purposes, we take three
values of A_min: 10^-22 Gyr^-1, 10^-11 Gyr^-1, and 10^-3 Gyr^-1,
corresponding roughly to life occurring once in the observable
Universe, once in our galaxy, and once per 200 stars (assuming
one Earth-like planet per star).

In standard Bayesian terminology, both the uniform in A
and the uniform in A^-1 priors are said to be highly "informative."
This means that they strongly favor large and small,
respectively, values of A in advance, i.e., on some basis other
than the empirical evidence represented by the likelihood term.
For example, the uniform in A prior asserts that we know on
some other basis (other than the early emergence of life on
Earth) that it is a hundred times less likely that A is less than
10^-3 Gyr^-1 than that it is less than 0.1 Gyr^-1. The uniform
in A^-1 prior has the equivalent sort of preference for small A
values. By contrast, the logarithmic prior is relatively
"uninformative" in standard Bayesian terminology and is equivalent
to asserting that we have no prior information that informs
us of even the order-of-magnitude of A.

In our opinion, the logarithmic prior is the most appropriate
one given our current lack of knowledge of the process(es)
of abiogenesis, as it represents scale-invariant ignorance of the
value of A. It is, nevertheless, instructive to carry all three
priors through the calculation of the posterior distribution of A,
because they vividly illuminate the extent to which the result
depends on the data vs the assumed prior.



Comparison with Previous Analysis. Using a binomial probability
analysis, Lineweaver & Davis [9] attempted to quantify
q, the probability that life would arise within the first billion
years on an Earth-like planet. Although the binomial distribution
typically applies to discrete situations (in contrast to
the continuous passage of time, during which life might arise),
there is a simple correspondence between their analysis and
the Poisson model described above. The probability that life
would arise at least once within a billion years (what [9] call
q) is a simple transformation of A, obtained from equation (2),
with Δt_1 = 1 Gyr:

\[ q = 1 - e^{-(A)(1\,{\rm Gyr})} \quad {\rm or} \quad A = -\ln[1 - q]/(1\,{\rm Gyr}) . \qquad [6] \]

In the limit of A(1 Gyr) ≪ 1, equation (6) implies that q
is approximately equal to A(1 Gyr). Though not cast in Bayesian
terms, the analysis in [9] draws a Bayesian conclusion and therefore
is based on an implicit prior that is uniform in q. As a result, it
is equivalent to our uniform-A prior for small values of A (or
q), and it is this implicit prior, not the early emergence of life
on Earth, that dominates their conclusions.
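
The correspondence between q and A, and the reason a prior that is flat in q behaves like a flat prior in A at small rates, can be sketched as follows. The function names are hypothetical, `lam` stands for A, and the density formula is just the change-of-variables Jacobian |dq/dA| for the transformation in equation (6).

```python
import math

GYR = 1.0  # all rates are per Gyr, so the 1 Gyr window is 1.0 in these units

def q_from_rate(lam):
    """q = 1 - exp(-lam * 1 Gyr): chance of at least one event in 1 Gyr."""
    return -math.expm1(-lam * GYR)

def rate_from_q(q):
    """Inverse transformation: lam = -ln(1 - q) / (1 Gyr), for 0 <= q < 1."""
    return -math.log1p(-q) / GYR

def implied_rate_density(lam):
    """Density on lam implied by a prior uniform in q: |dq/dlam| = exp(-lam)."""
    return GYR * math.exp(-lam * GYR)

if __name__ == "__main__":
    for lam in (1e-6, 1e-3, 1e-1, 1.0, 10.0):
        print(lam, q_from_rate(lam), implied_rate_density(lam))
```

For rates well below 1 Gyr^-1 the implied density is essentially constant, which is the sense in which a uniform prior on q matches the uniform-A prior there.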



The Posterior Probability of Abiogenesis 

We compute the normalized product of the probability of the
data given A (equation (3)) with each of the three priors (uniform,
logarithmic, and inverse uniform). This gives us the
Bayesian posterior PDF of A, which we also derive for each
model in Table 1. Then, by integrating each PDF from -∞ to
A, we obtain the corresponding cumulative distribution function
(CDF).
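
The calculation described above can be sketched on a logarithmic grid. This is a minimal illustration, not the authors' code: it assumes the Optimistic time differences (0.2 and 3.0 Gyr), bounds of 10^-3 and 10^3 Gyr^-1, and, as a simplification, truncates the uniform prior at the lower bound rather than at zero (the omitted prior mass is about one part in a million). `lam` stands for A.

```python
import math

def likelihood(lam, dt1=0.2, dt2=3.0):
    """Equation (3) in terms of the two time differences (Gyr)."""
    return math.expm1(-lam * dt1) / math.expm1(-lam * dt2)

# Unnormalized prior densities in lam.
PRIORS = {
    "uniform": lambda lam: 1.0,                # flat in A
    "log": lambda lam: 1.0 / lam,              # flat in log A
    "inv_uniform": lambda lam: 1.0 / lam**2,   # flat in 1/A
}

def prob_below(prior_name, threshold, use_data=True,
               lam_min=1e-3, lam_max=1e3, n=4000):
    """P(lam < threshold) under the prior alone or under the posterior."""
    prior = PRIORS[prior_name]
    step = math.log(lam_max / lam_min) / n
    total = below = 0.0
    for i in range(n):
        lam = lam_min * math.exp((i + 0.5) * step)  # cell midpoint in ln(lam)
        weight = prior(lam) * lam * step            # d(lam) = lam * d(ln lam)
        if use_data:
            weight *= likelihood(lam)
        total += weight
        if lam < threshold:
            below += weight
    return below / total

if __name__ == "__main__":
    for name in PRIORS:
        print(name,
              "prior:", prob_below(name, 1.0, use_data=False),
              "posterior:", prob_below(name, 1.0))
```

Comparing the prior and posterior columns shows which priors are moved by the data: under these assumptions only the logarithmic prior shifts appreciably.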

Figure 1 displays the results by plotting the prior and
posterior probability of A. The top panel presents the PDF,
and the bottom panel the CDF, for uniform, logarithmic, and
inverse-uniform priors, for model Optimistic, which sets Δt_1
(the maximum time it might have taken life to emerge once
Earth became habitable) to 0.2 Gyr, and Δt_2 (the time life
had available to emerge in order that intelligent creatures
would have a chance to evolve) to 3.0 Gyr. The dashed
and solid curves represent, respectively, prior and posterior
probability functions. In this figure, the priors on A have
A_min = 10^-3 Gyr^-1 and A_max = 10^3 Gyr^-1. The green, blue,
and red curves are calculated for uniform, logarithmic, and
inverse-uniform priors, respectively. The results of the
corresponding calculations for the other models and bounds on the
assumed priors are presented in the Supporting Information,
but the cases shown in Fig. 1 suffice to demonstrate all of the
important qualitative behaviors of the posterior.

In the plot of differential probability (PDF; top panel), it
appears that the inferred posterior probabilities of different
values of A are conditioned similarly by the data (leading to
a jump in the posterior PDF of roughly an order of magnitude
in the vicinity of A ~ 0.5 Gyr^-1). The plot of cumulative
probability, however, immediately shows that the uniform and
the inverse priors produce posterior CDFs that are completely
insensitive to the data. Namely, small values of A are strongly
excluded in the uniform in A prior case and large values are
equally strongly excluded by the uniform in A^-1 prior, but
these strong conclusions are not a consequence of the data,
only of the assumed prior. This point is particularly salient,
given that a Bayesian interpretation of [9] indicates an implicit
uniform prior. In other words, their conclusion that q
cannot be too small and thus that life should not be too rare
in the Universe is not a consequence of the evidence of the
early emergence of life on the Earth but almost only of their
particular parameterization of the problem.

Fig. 2. A_min = 10^-3 Gyr^-1, A_max = 10^3 Gyr^-1. A discovery that life
arose independently on Mars and Earth or on an exoplanet and Earth - or
that it arose a second, independent, time on Earth - would significantly
reduce the posterior probability of low A.

Fig. 3. Lower bound on A for logarithmic prior, Hypothetical model. The
three curves depict median (50%), 1-σ (68.3%), and 2-σ (95.4%) lower
bounds on A, as a function of A_min.

For the Optimistic parameters, the posterior CDF computed
with the uninformative logarithmic prior does reflect
the influence of the data, making greater values of A more
probable in accordance with one's intuitive expectations.
However, with this relatively uninformative prior, there is a
significant probability that A is very small (12% chance that
A < 1 Gyr^-1). Moreover, if we adopted smaller A_min, smaller
A_max, and/or a larger Δt_1/Δt_2 ratio, the posterior probability
of an arbitrarily low A value can be made acceptably high (see
Fig. 3 and the Supporting Information).

Independent Abiogenesis. We have no strong evidence that
life ever arose on Mars (although no strong evidence to the
contrary either). Recent observations have tentatively suggested
the presence of methane at the level of ~20 parts per
billion (ppb) [26], which could potentially be indicative of
biological activity. The case is not entirely clear, however, as
alternative analysis of the same data suggests that an upper
limit to the methane abundance is in the vicinity of ~3 ppb
[27]. If, in the future, researchers find compelling evidence
that Mars or an exoplanet hosts life that arose independently
of life on Earth (or that life arose on Earth a second,
independent time [28] [29]), how would this affect the posterior
probability density of A (assuming that the same A holds for
both instances of abiogenesis)?

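If Mars, for instance, and Earth share a single A and life
arose on Mars, then the likelihood of Mars' A is the joint
probability of our data on Earth and of life arising on Mars.
Assuming no panspermia in either direction, these events are
independent:

\[ P[D|M] = \left(1 - \exp[-A(t^{\rm Mars}_{\rm emerge} - t^{\rm Mars}_{\rm min})]\right) \times \frac{1 - \exp[-A(t^{\rm Earth}_{\rm emerge} - t^{\rm Earth}_{\rm min})]}{1 - \exp[-A(t^{\rm Earth}_{\rm required} - t^{\rm Earth}_{\rm min})]} \qquad [7] \]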

For Mars, we take t_max^Mars = t_emerge^Mars = 1 Gyr and t_min^Mars =
0.5 Gyr. The posterior cumulative probability distribution
of A, given a logarithmic prior between 0.001 Gyr^-1 and
1000 Gyr^-1, is as represented in Fig. 2 for the case of finding
a second, independent sample of life and, for comparison,
the Optimistic case for Earth. Should future researchers
find that life arose independently on Mars (or elsewhere), this
would dramatically reduce the posterior probability of very
low A relative to our current inferences.
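
The effect of a second, independent origin of life can be illustrated by multiplying the Earth likelihood by the probability that life arose on Mars within its window. This sketch assumes the Optimistic Earth parameters (0.2 and 3.0 Gyr), the Mars times quoted above (0.5 and 1 Gyr), and a logarithmic prior between 10^-3 and 10^3 Gyr^-1; the function names are hypothetical and `lam` stands for A.

```python
import math

def earth_likelihood(lam, dt1=0.2, dt2=3.0):
    """Earth term: early emergence given that observers exist."""
    return math.expm1(-lam * dt1) / math.expm1(-lam * dt2)

def mars_factor(lam, t_emerge=1.0, t_min=0.5):
    """Mars term: probability that life arose at least once by t_emerge."""
    return -math.expm1(-lam * (t_emerge - t_min))

def prob_below(threshold, with_mars, lam_min=1e-3, lam_max=1e3, n=4000):
    """Posterior P(lam < threshold) under a prior flat in ln(lam)."""
    step = math.log(lam_max / lam_min) / n
    total = below = 0.0
    for i in range(n):
        lam = lam_min * math.exp((i + 0.5) * step)
        weight = earth_likelihood(lam) * step  # equal prior mass per cell
        if with_mars:
            weight *= mars_factor(lam)         # independent events multiply
        total += weight
        if lam < threshold:
            below += weight
    return below / total

if __name__ == "__main__":
    print("Earth only:  ", prob_below(1.0, with_mars=False))
    print("Earth + Mars:", prob_below(1.0, with_mars=True))
```

The Mars term scales linearly with the rate when the rate is small, so it suppresses exactly the low-rate region that the Earth datum alone cannot exclude.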

Arbitrarily Low Posterior Probability of A. We do not actually
know what the appropriate lower (or upper) bounds on
A are. Figure 3 portrays the influence of changing A_min on
the median posterior estimate of A, and on 1-σ and 2-σ confidence
lower bounds on posterior estimates of A. Although the
median estimate is relatively insensitive to A_min, a 2-σ lower
bound on A becomes arbitrarily low as A_min decreases.
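
The dependence of the lower bounds on the prior cutoff can be explored with a short quantile calculation. This is only a sketch: the Hypothetical time differences used here (Δt_1 = 0.01 Gyr, Δt_2 = 3.0 Gyr) are assumptions consistent with the "10 million years" and R of about 300 mentioned in the text, the upper cutoff is taken as 10^3 Gyr^-1, and `lam` stands for A.

```python
import math

def likelihood(lam, dt1, dt2):
    """Equation (3) in terms of the two time differences (Gyr)."""
    return math.expm1(-lam * dt1) / math.expm1(-lam * dt2)

def lower_bound(confidence, lam_min, lam_max=1e3, dt1=0.01, dt2=3.0, n=20000):
    """Rate lam_lo such that the posterior P(lam > lam_lo) is about
    `confidence`, for a prior flat in ln(lam) on [lam_min, lam_max]."""
    step = math.log(lam_max / lam_min) / n
    lams = [lam_min * math.exp((i + 0.5) * step) for i in range(n)]
    # With a log prior every cell carries equal prior mass, so the
    # posterior weight of a cell is just its likelihood.
    weights = [likelihood(lam, dt1, dt2) for lam in lams]
    target = (1.0 - confidence) * sum(weights)
    running = 0.0
    for lam, weight in zip(lams, weights):
        running += weight
        if running >= target:
            return lam
    return lams[-1]

if __name__ == "__main__":
    for lam_min in (1e-3, 1e-11, 1e-22):
        print(lam_min,
              "median:", lower_bound(0.5, lam_min),
              "1-sigma:", lower_bound(0.683, lam_min),
              "2-sigma:", lower_bound(0.954, lam_min))
```

Lowering the cutoff adds prior mass at small rates where the likelihood is flat, which is why the high-confidence bound drifts downward while the median barely moves.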



Conclusions 

Within a few hundred million years, and perhaps far more
quickly, of the time that Earth became a hospitable location
for life, it transitioned from being merely habitable to being
inhabited. Recent rapid progress in exoplanet science suggests
that habitable worlds might be extremely common in
our galaxy [30, 31, 32, 33], which invites the question of how
often life arises, given habitable conditions. Although this
question ultimately must be answered empirically, via searches
for biomarkers [34] or for signs of extraterrestrial technology
[35], the early emergence of life on Earth gives us some
information about the probability that abiogenesis will result from
early-Earth-like conditions.

A Bayesian approach to estimating the probability of abiogenesis
clarifies the relative influence of data and of our prior
beliefs. Although a "best guess" of the probability of abiogenesis
suggests that life should be common in the Galaxy
if early-Earth-like conditions are, still, the data are consistent
(under plausible priors) with life being extremely rare, as
shown in Figure 3. Thus, a Bayesian enthusiast of extraterrestrial
life should be significantly encouraged by the rapid
appearance of life on the early Earth but cannot be highly
confident on that basis.

We note that the comparatively very late emergence of radio technology on Earth could,
analogously, be taken as an indication (albeit a weak one because of our single datum) that radio
technology might be rare in our galaxy.

Our conclusion that the early emergence of life on Earth
is consistent with life being very rare in the Universe for
plausible priors is robust against two of the more fundamental
simplifications in our formal analysis. First, we have assumed
that there is a single value of $\lambda$ that applies to all
Earth-like planets (without specifying exactly what we mean by
"Earth-like"). If $\lambda$ actually varies from planet to
planet, as seems far more plausible, anthropic-like
considerations imply that planets with particularly large
$\lambda$ values will have a greater chance of producing
(intelligent) life and of life appearing relatively rapidly,
i.e., of the circumstances in which we find ourselves. Thus,
the information we derive about $\lambda$ from the existence
and early appearance of life on Earth will tend to be biased
towards large values and may not be representative of the
value of $\lambda$ for, say, an "average" terrestrial planet
orbiting within the habitable zone of a main sequence star.
Second, our formulation of the problem analyzed in this paper
implicitly assumes that there is no increase in the probability
of intelligent life appearing once $\delta t_{\rm evolve}$ has
elapsed following the abiogenesis event on a planet. A more
reasonable model in which this probability continues to
increase as additional time passes would have the same
qualitative effect on the calculation as increasing
$\delta t_{\rm evolve}$. In other words, it would make the
resulting posterior distribution of $\lambda$ even less
sensitive to the data and more highly dependent on the prior
because it would make our presence on Earth a selection bias
favoring planets on which abiogenesis occurred quickly.

We had to find ourselves on a planet that has life on it,
but we did not have to find ourselves (i) in a galaxy that has
life on a planet besides Earth nor (ii) on a planet on which
life arose multiple, independent times. Learning that either
(i) or (ii) describes our world would constitute data that are
not subject to the selection effect described above. In short,
if we should find evidence of life that arose wholly
independently of us (either via astronomical searches that
reveal life on another planet or via geological and biological
studies that find evidence of life on Earth with a different
origin from us), we would have considerably stronger grounds to
conclude that life is probably common in our galaxy. With this
in mind, research in the fields of astrobiology and origin of
life studies might, in the near future, help us to
significantly refine our estimate of the probability (per unit
time, per Earth-like planet) of abiogenesis.
Supplementary Material 

Formal Derivation of the Posterior Probability of Abiogenesis. Let $t_a$ be the time of abiogenesis and $t_i$ be the time of the
emergence of intelligence ($t_0$, $t_{\rm emerge}$, and $t_{\rm required}$ are as defined in the text: $t_0$ is the current age of the Earth; $t_{\rm emerge}$ is the
upper limit on the age of the Earth when life first arose; and $t_{\rm required}$ is the maximum age the Earth could have had when life
arose in order for it to be possible for sentient beings to later arise by $t_0$). By "intelligence", we mean organisms that think
about abiogenesis. Furthermore, let

$$
\begin{aligned}
E &= t_{\rm min} < t_a < t_{\rm emerge} \\
R &= t_{\rm min} < t_a < t_{\rm required} \\
I &= t_{\rm min} < t_i < t_0 \\
M &= \text{``The Poisson rate parameter has value } \lambda\text{''}
\end{aligned}
$$

We assert (perhaps somewhat unreasonably) that the probability of intelligence arising ($I$) is independent of the actual
time of abiogenesis ($t_a$), so long as life shows up within $t_{\rm required}$ ($R$):

$$P[I \mid R, M, t_a] = P[I \mid R, M]. \qquad [8]$$

Although the probability of intelligence arising could very well be greater if abiogenesis occurs earlier on a world, the
consequence of relaxing this assertion (discussed in the Conclusion and elsewhere in the text) is to increase the posterior probability
of arbitrarily low $\lambda$.

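Using the conditional version of Bayes's theorem,

$$P[t_a \mid R, M, I] = \frac{P[I \mid R, M, t_a] \times P[t_a \mid R, M]}{P[I \mid R, M]}, \qquad [9]$$

and Eq. (8) then implies that $P[t_a \mid R, M, I] = P[t_a \mid R, M]$. An immediate result of this (integrating both sides over $t_{\rm min} < t_a < t_{\rm emerge}$) is that

$$P[E \mid R, M, I] = P[E \mid R, M]. \qquad [10]$$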

We now apply Bayes's theorem again to get the posterior probability of $\lambda$, given our circumstances and our observations:

$$P[M \mid E, R, I] = \frac{P[E \mid R, M, I] \times P[M \mid R, I]}{P[E \mid R, I]}. \qquad [11]$$

Note that, since $t_{\rm emerge} < t_{\rm required}$, $E \Rightarrow R$. And, as discussed in the main text, $P[E \mid R, M] = P[E \mid M]/P[R \mid M]$. Finally, since we had
to find ourselves on a planet on which $R$ and $I$ hold, these conditions tell us nothing about the value of $\lambda$. In other words,
$P[M \mid R, I] = P[M]$. We therefore use Eq. (10) to rewrite Eq. (11) as the posterior probability implied in the text:

$$P[M \mid E, I] = \frac{P[E \mid M] \times P[M]}{P[R \mid M] \times P[E \mid R, I]}. \qquad [12]$$
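
As a worked illustration of the likelihood factor in Eq. (12), suppose abiogenesis is a Poisson process of rate $\lambda$ that begins at $t_{\rm min}$, and write $\Delta t_1 \equiv t_{\rm emerge} - t_{\rm min}$ and $\Delta t_2 \equiv t_{\rm required} - t_{\rm min}$. The explicit expressions below are a sketch under that assumption and should be checked against the definitions in the main text:

$$
\frac{P[E \mid M]}{P[R \mid M]} = \frac{1 - e^{-\lambda \Delta t_1}}{1 - e^{-\lambda \Delta t_2}},
\qquad
\lim_{\lambda \to 0} \frac{P[E \mid M]}{P[R \mid M]} = \frac{\Delta t_1}{\Delta t_2},
\qquad
\lim_{\lambda \to \infty} \frac{P[E \mid M]}{P[R \mid M]} = 1 .
$$

In this form the factor varies only between $\Delta t_1/\Delta t_2$ and 1, so the data can change the relative weight of very small and very large $\lambda$ by at most a factor of $\Delta t_2/\Delta t_1$, whatever the prior.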

Model-Dependence of Posterior Probability of Abiogenesis. In the main text, we demonstrated the strong dependence of the
posterior probability of life on the form of the prior for $\lambda$. Here, we present a suite of additional calculations, for different
bounds to $\lambda$ and for different values of $\Delta t_1$ and $\Delta t_2$.

Figure 4 displays the results of analogous calculations to those of Fig. 1, for three sets of model parameters (Hypothetical,
Optimistic, Conservative) and for three values of $\lambda_{\rm min}$ ($10^{-22}$ Gyr$^{-1}$, $10^{-11}$ Gyr$^{-1}$, $10^{-3}$ Gyr$^{-1}$). For all three models, the
posterior CDFs for the uniform and the inverse-uniform priors almost exactly match the prior CDFs, and, hence, are almost
completely insensitive to the data. For the Conservative model (in which $\Delta t_1 = 0.8$ Gyr and $\Delta t_2 = 0.9$ Gyr, values certainly not
ruled out by available data), even the logarithmic prior's CDF is barely sensitive to the observation that there is life on Earth.
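
A rough numerical check of this insensitivity for the Conservative model can be sketched as follows. The code assumes the Poisson likelihood ratio $(1 - e^{-\lambda \Delta t_1})/(1 - e^{-\lambda \Delta t_2})$, reads "inverse-uniform" as uniform in $1/\lambda$, and uses illustrative bounds `lam_min` and `lam_max`; all function and variable names are hypothetical.

```python
import numpy as np

def likelihood(lam, dt1, dt2):
    """Assumed Poisson form of P[E|M]/P[R|M]."""
    return np.expm1(-lam * dt1) / np.expm1(-lam * dt2)

def max_cdf_shift(weight_per_dex, dt1=0.8, dt2=0.9, lam_min=1e-3, lam_max=1e3, n=20001):
    """Largest absolute difference between the posterior and prior CDFs of lambda.

    `weight_per_dex(lam)` is the prior weight on a grid uniform in log10(lambda),
    i.e. lambda * p(lambda) up to a constant. Bounds are illustrative assumptions.
    """
    lam = np.logspace(np.log10(lam_min), np.log10(lam_max), n)
    prior = weight_per_dex(lam)
    post = prior * likelihood(lam, dt1, dt2)
    prior_cdf = np.cumsum(prior) / prior.sum()
    post_cdf = np.cumsum(post) / post.sum()
    return float(np.max(np.abs(post_cdf - prior_cdf)))

# Prior weights per logarithmic interval of lambda.
PRIORS = {
    "uniform": lambda lam: lam,                     # p(lambda) constant
    "logarithmic": lambda lam: np.ones_like(lam),   # p(lambda) proportional to 1/lambda
    "inverse-uniform": lambda lam: 1.0 / lam,       # p(lambda) proportional to 1/lambda^2
}

shifts = {name: max_cdf_shift(w) for name, w in PRIORS.items()}

if __name__ == "__main__":
    for name, shift in shifts.items():
        print(f"{name:16s} max |posterior CDF - prior CDF| = {shift:.4f}")
```

The uniform prior concentrates its weight at large $\lambda$ and the inverse-uniform prior at small $\lambda$, where the likelihood is nearly constant in each case, which is why the data barely move either CDF in this sketch.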

Finally, the effect of $\delta t_{\rm evolve}$ (the minimum timescale required for sentience to evolve) is to impose a selection effect that
becomes progressively more severe as $\delta t_{\rm evolve}$ approaches $t_0 - t_{\rm emerge}$. Figure 5 makes this point vividly. For the Optimistic
model, posterior probabilities are shown as color maps as functions of $\lambda$ (abscissa) and $\delta t_{\rm evolve}$ (ordinate). At each horizontal
cut across the PDF plots (left column), the values integrate to unity, as expected for a proper probability density function. For
short values of $\delta t_{\rm evolve}$, the selection effect (that intelligent creatures take some time to evolve) is unimportant, and the data
might be somewhat informative about the true distribution of $\lambda$. For larger values of $\delta t_{\rm evolve}$, the selection effect becomes more
important, to the point that the probability of the data given $\lambda$ approaches 1, and the posterior probability approaches the
prior.
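
The trend can be reproduced in a small sketch, assuming $t_{\rm required} = t_0 - \delta t_{\rm evolve}$ and the Poisson likelihood ratio $(1 - e^{-\lambda \Delta t_1})/(1 - e^{-\lambda \Delta t_2})$. The constants `T0`, `T_MIN`, and `T_EMERGE` are illustrative assumptions rather than the parameters used for Figure 5, and the function names are hypothetical.

```python
import numpy as np

T0 = 4.5        # current age of the Earth in Gyr (assumed round value)
T_MIN = 0.5     # assumed time at which Earth became habitable, Gyr
T_EMERGE = 0.7  # assumed upper limit on the time of abiogenesis, Gyr

def likelihood(lam, dt_evolve):
    """Assumed Poisson form of P[E|M]/P[R|M] with t_required = T0 - dt_evolve."""
    dt1 = T_EMERGE - T_MIN
    dt2 = (T0 - dt_evolve) - T_MIN
    return np.expm1(-lam * dt1) / np.expm1(-lam * dt2)

def dynamic_range(dt_evolve, lam_lo=1e-3, lam_hi=1e3):
    """Ratio of the likelihood at high lambda to that at low lambda.

    A value near 1 means the data cannot distinguish large from small lambda,
    so the posterior reverts to the prior.
    """
    return float(likelihood(lam_hi, dt_evolve) / likelihood(lam_lo, dt_evolve))

if __name__ == "__main__":
    for dt_evolve in (1.0, 2.0, 3.0, 3.5, 3.79):
        print(f"dt_evolve = {dt_evolve:4.2f} Gyr -> likelihood dynamic range = {dynamic_range(dt_evolve):.3f}")
```

As $\delta t_{\rm evolve}$ approaches $t_0 - t_{\rm emerge}$ (3.8 Gyr for these assumed constants), $\Delta t_2$ shrinks toward $\Delta t_1$ and the dynamic range falls toward 1.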
Fig. 4. PDF (left) and CDF (right) of $\lambda$ for uniform, logarithmic, and inverse-uniform priors, for models Hypothetical (top), Optimistic (middle),
Conservative (bottom). Curves are shown for priors with $\lambda_{\rm min} = 10^{-22}$ Gyr$^{-1}$, $\lambda_{\rm min} = 10^{-11}$ Gyr$^{-1}$, and $\lambda_{\rm min} = 10^{-3}$ Gyr$^{-1}$. The uniform and
inverse-uniform priors lead to CDFs that are completely insensitive to the data for all three models; for the Conservative model, even the logarithmic prior is insensitive
to the data.